Data-driven Formant Synthesis of Speaker Age
Abstract
This paper briefly describes the development of a research tool for analysis of speaker age using data-driven formant synthesis. A prototype system was developed to automatically extract 23 acoustic parameters from the Swedish word ‘själen’ [ˈɧɛːlən] (the soul) spoken by four differently aged female speakers of the same dialect and family, and to generate synthetic copies. Functions for parameter adjustment and for audio-visual comparison of the natural and synthesised words, using waveforms and spectrograms, were added to improve the synthesised words. Age-weighted linear parameter interpolation was then used to synthesise a target age anywhere between the ages of two source speakers. After an initial evaluation, the system was further improved and extended. A second evaluation indicated that speaker age may be successfully synthesised using data-driven formant synthesis and weighted linear interpolation.
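The age-weighted linear interpolation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the dictionary layout of the parameter tracks, and the assumption that both speakers' tracks are already time-aligned and of equal length are all hypothetical choices made for illustration.

```python
from typing import Dict, List

# Hypothetical layout: parameter name -> list of per-frame values.
ParamTracks = Dict[str, List[float]]


def age_weight(age_young: float, age_old: float, target_age: float) -> float:
    """Return the weight of the older speaker, between 0.0 and 1.0."""
    if age_old <= age_young:
        raise ValueError("age_old must be greater than age_young")
    if not age_young <= target_age <= age_old:
        raise ValueError("target_age must lie between the two source ages")
    return (target_age - age_young) / (age_old - age_young)


def interpolate_tracks(
    young: ParamTracks,
    old: ParamTracks,
    age_young: float,
    age_old: float,
    target_age: float,
) -> ParamTracks:
    """Linearly interpolate every parameter track, frame by frame.

    Assumes (for this sketch only) that both speakers have the same
    parameter names and time-aligned tracks of equal length.
    """
    w = age_weight(age_young, age_old, target_age)
    result: ParamTracks = {}
    for name, young_track in young.items():
        old_track = old[name]
        if len(young_track) != len(old_track):
            raise ValueError(f"track length mismatch for {name!r}")
        result[name] = [
            (1.0 - w) * y + w * o for y, o in zip(young_track, old_track)
        ]
    return result


# Made-up example values, not data from the paper.
young_speaker = {"F1": [800.0, 700.0]}
old_speaker = {"F1": [600.0, 500.0]}
target = interpolate_tracks(young_speaker, old_speaker, 20, 60, 30)
```

With source ages 20 and 60 and a target age of 30, the older speaker receives a weight of 0.25, so each target value sits a quarter of the way from the younger speaker's value towards the older speaker's.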
Similar resources
F0 and Segment Duration in Formant Synthesis of Speaker Age
This paper describes the work with F0 and segment duration when developing a prototype system for analysis of speaker age using data-driven formant synthesis. The system was developed to extract 23 parameters from the test words—spoken by four differently aged female speakers of the same dialect and family—and to generate synthetic copies. Audio-visual feedback enabled the user to compare the n...
Exploring data driven parametric synthesis
This paper describes our work on building a formant synthesis system based on both rule generated and database driven methods. Three parametric synthesis systems are discussed: our traditional rule based system, a speaker adapted system, and finally a gesture system. The gesture system is a further development of the adapted system in that it includes concatenated formant gestures from a data-d...
Formant analysis and synthesis using hidden Markov models
This paper describes a unifying framework for both formant tracking and speech synthesis using Hidden Markov Models (HMM). The feature vector in the HMM is composed by the first three formant frequencies, their bandwidths and their delta with time. Speech is synthesized by generating the most likely sequence of feature vectors from a HMM, trained with a set of sentences from a given speaker. Hi...
Formant diphone parameter extraction utilising a labelled single-speaker database
This paper examines a method for formant parameter extraction from a labelled single-speaker database for use in a formant-parameter diphone-concatenation speech synthesis system. This procedure commences with an initial formant analysis of the labelled database, which is then used to obtain formant (F1-F5) probability spaces for each phoneme. These probability spaces guide a more careful speaker...
Towards synthesis of speaker age: A perceptual study with natural, synthesized and resynthesized stimuli
As a first step towards synthesis of speaker age the hypothesis that spectral cues may be more important for age perception than F0 and duration was tested in a pilot listening experiment with male speaker stimuli consisting of natural, synthesized and resynthesized isolated words. Results indicate that spectral information is dominant over pitch as cues for age. Slow speech rate also seems to ...
Publication date: 2006